Generating information-rich high-throughput experimental materials genomes using functional clustering via multitree genetic programming and information theory.

Authors

  • Santosh K Suram
  • Joel A Haber
  • Jian Jin
  • John M Gregoire
Abstract

High-throughput experimental methodologies can synthesize, screen, and characterize vast arrays of combinatorial material libraries at a very rapid rate. These methodologies strategically employ tiered screening, wherein the number of compositions screened decreases as the complexity of a screening experiment, and very often the scientific information obtained from it, increases. The algorithm used to down-select samples from a higher-throughput screening experiment to a lower-throughput one is vital in achieving information-rich experimental materials genomes. The fundamental science of material discovery lies in the establishment of composition-structure-property relationships, motivating the development of advanced down-selection algorithms that consider the information value of the selected compositions, as opposed to simply selecting the best-performing compositions from a high-throughput experiment. Identification of property fields (composition regions with distinct composition-property relationships) in high-throughput data enables down-selection algorithms to employ advanced selection strategies, such as selecting representative compositions from each field or selecting compositions that span the composition space of the highest-performing field. Such strategies would greatly enhance the generation of data-driven discoveries. We introduce an informatics-based clustering of composition-property functional relationships that combines information theory and multitree genetic programming concepts to identify property fields in a composition library. We demonstrate our approach using a complex synthetic composition-property map for a 5 at.% step ternary library consisting of four distinct property fields, and finally explore the application of this methodology to capturing relationships between composition and catalytic activity for the oxygen evolution reaction for 5429 catalyst compositions in a (Ni-Fe-Co-Ce)Ox library.
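As a rough illustration of one selection strategy named in the abstract (choosing a representative composition from each property field), the following Python sketch enumerates a 5 at.% step ternary grid and picks, for each field, the member nearest that field's centroid. It is not the authors' method: the function names, the centroid rule, and the `toy_label` field assignment are assumptions made purely for illustration, and the sketch does not implement the multitree genetic programming or information-theoretic clustering that the paper uses to identify the fields.

```python
def ternary_grid(step=5):
    """All (a, b, c) in at.% with a + b + c = 100 on a `step` at.% grid."""
    n = 100 // step
    return [(i * step, j * step, (n - i - j) * step)
            for i in range(n + 1) for j in range(n + 1 - i)]


def representatives(compositions, labels):
    """For each field label, pick the composition nearest that field's centroid."""
    fields = {}
    for comp, label in zip(compositions, labels):
        fields.setdefault(label, []).append(comp)
    chosen = {}
    for label, members in fields.items():
        centroid = [sum(c[k] for c in members) / len(members) for k in range(3)]
        chosen[label] = min(
            members,
            key=lambda c: sum((c[k] - centroid[k]) ** 2 for k in range(3)),
        )
    return chosen


def toy_label(comp):
    """Hypothetical field assignment (NOT the paper's clustering):
    label by majority element, or "mixed" if no element exceeds 50 at.%."""
    top = max(comp)
    return "mixed" if top <= 50 else "ABC"[comp.index(top)]


grid = ternary_grid(5)                    # 231 compositions at a 5 at.% step
labels = [toy_label(c) for c in grid]     # four toy fields: A, B, C, mixed
reps = representatives(grid, labels)      # one representative per field
print(len(grid), reps)
```

In practice the labels would come from whatever clustering identifies the property fields; the selection step only needs a mapping from composition to field, so the centroid rule here could be swapped for any other notion of "representative".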




Journal title:
  • ACS Combinatorial Science

Volume 17, Issue 4

Pages: -

Publication date: 2015